Columnar storage and list-based processing for graph database management systems

Abstract

We revisit column-oriented storage and query processing techniques in the context of contemporary graph database management systems (GDBMSs). Similar to RDBMSs, GDBMSs support read-heavy analytical workloads, which however have fundamentally different data access patterns than traditional workloads. We first derive a set of desiderata for optimizing the processors of GDBMSs based on their access patterns. We then present the design of columnar storage, compression, and query processing techniques based on these desiderata. In addition to showing the direct integration of existing techniques from RDBMSs, we also propose novel ones that are optimized for GDBMSs. These include a list-based processor, which avoids the expensive copies of block-based processors under many-to-many joins; a new structure we call single-indexed edge property pages, with an accompanying ID scheme; and an application of Jacobson's bit vector index for compressing NULL values and empty lists. We integrated our techniques into the GraphflowDB in-memory GDBMS. Through extensive experiments, we demonstrate the scalability and performance benefits of our techniques.
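To make the NULL-compression idea in the abstract concrete, the following is a minimal sketch of a rank-indexed bit vector over a sparse column. It is an illustration of the general technique only, not the paper's or GraphflowDB's implementation. The class name `NullCompressedColumn`, the block size, and the single-level rank table are assumptions made for this example. A full Jacobson-style index uses a two-level table plus lookup tables to answer rank in constant time, whereas this sketch scans within one block.

```python
from typing import List, Optional

BLOCK = 16  # bits per rank block; an arbitrary choice for this sketch


class NullCompressedColumn:
    """Hypothetical column that stores only non-NULL values plus a ranked bit vector."""

    def __init__(self, column: List[Optional[int]]) -> None:
        self.size = len(column)
        # One bit per slot: True where the slot holds a value.
        self.bits = [v is not None for v in column]
        # Dense array holding the non-NULL values in slot order.
        self.values = [v for v in column if v is not None]
        # block_rank[b] = number of set bits before the start of block b.
        self.block_rank: List[int] = []
        count = 0
        for i, bit in enumerate(self.bits):
            if i % BLOCK == 0:
                self.block_rank.append(count)
            count += bit

    def rank(self, i: int) -> int:
        """Number of non-NULL slots strictly before position i."""
        block = i // BLOCK
        start = block * BLOCK
        # Precomputed prefix count plus a short scan inside the block.
        return self.block_rank[block] + sum(self.bits[start:i])

    def get(self, i: int) -> Optional[int]:
        """Value at logical position i, or None if that slot is NULL."""
        if not 0 <= i < self.size:
            raise IndexError(i)
        if not self.bits[i]:
            return None
        return self.values[self.rank(i)]


# A mostly-NULL column of 26 slots with three values.
sparse = [None, 7, None, None, 3] + [None] * 20 + [9]
col = NullCompressedColumn(sparse)
```

In this sketch only three values are stored for 26 logical slots, and a lookup maps a logical position to its offset in the dense array through the rank of its bit. The same mapping would apply to empty adjacency lists if each bit marked whether a list is non-empty.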

Related resources

Building a Columnar Database on Shared Main Memory-Based Storage

In the field of disk-based parallel database management systems exists a great variety of solutions based on a shared-storage or a shared-nothing architecture. In contrast, main memory-based parallel database management systems are dominated solely by the shared-nothing approach as it preserves the in-memory performance advantage by processing data locally on each server. We argue that this uni...

Hybrid Storage Management for Database Systems

The use of flash-based solid state drives (SSDs) in storage systems is growing. Adding SSDs to a storage system not only raises the question of how to manage the SSDs, but also raises the question of whether current buffer pool algorithms will still work effectively. We are interested in the use of hybrid storage systems, consisting of SSDs and hard disk drives (HDDs), for database management. ...

Ontology Based Query Processing in Database Management Systems

The use of semantic knowledge in its various forms has become an important aspect in managing data in database and information systems. In the form of integrity constraints, it has been used intensively in query optimization for some time. Similarly, data integration techniques have utilized semantic knowledge to handle heterogeneity for query processing on distributed information sources in a ...

BPP: Large Graph Storage for Efficient Disk Based Processing

Processing very large graphs like social networks, biological and chemical compounds is a challenging task. Distributed graph processing systems process the billion-scale graphs efficiently but incur overheads of efficient partitioning and distribution of the graph over a cluster of nodes. Distributed processing also requires cluster management and fault tolerance. In order to overcome these pr...

Graph Database Systems for Genomics

Genome databases have specific requirements which limit the usefulness of some database management systems. By using more appropriate database technology, a database system can be developed for genome data. We have developed a data representation based on graph theory which captures the highly interconnected structure of genome data. Graphs are a language which can be tailored for describing ge...

Journal

Journal title: Proceedings of the VLDB Endowment

Year: 2021

ISSN: 2150-8097

DOI: https://doi.org/10.14778/3476249.3476297